Example of Collapsing CPS-TRIM3 Replicate Households to Single Households for a Merge with CPS ASEC Data

The default TRIM3 data used to generate baseline eligibility estimates are based on CPS ASEC data and contain some replicate households. Some high-income households have been replicated for a match with SOI data, and others have been replicated for an immigrant status imputation. The TRIM3 variables HighIncomeClone and AlienHouseholdSplit identify replicate households. PersonWeight is the CPS ASEC person weight (MARSUPWT), which has been adjusted for persons in replicate households to account for the replication. Population weight totals are preserved, including for population sub-groups. The TRIM3 variable OldIdentifier is the CPS ASEC variable H_SEQ, and the TRIM3 variable CPSPersonID is the CPS ASEC variable PPPOS. HouseholdID and PersonID are TRIM3 household and person identifiers. Note: You may use the TRIM3 variable LineNumber (CPS ASEC variable LINENO) instead of CPSPersonID (CPS ASEC variable PPPOS) for this procedure.

The example below shows how to compute a weighted average SNAP benefit over all iterations (replicates) of each person in "alien" TRIM3 data that contain replicate households.

1) Create a dataset from your TRIM3 extract. Something similar to the following might be used for an extract that is in comma-delimited ASCII format.

data one;

infile {extract file name} delimiter=',';

INPUT
HouseholdID
PersonID
OldIdentifier
CPSPersonID
PersonWeight
AnnualBenefitsReceived /* Annual SNAP benefit */
;
run;

Alternatively, if your extract is in SAS 9 format, the data step would simply be something like the following.

data one;
set microdata_jnk;
run;

2) Sort the dataset by OldIdentifier and CPSPersonID.

proc sort data=one;
by OldIdentifier CPSPersonID;
run;

3) Create a dataset having just one record for each person in the original CPS ASEC data. The record will contain the weighted average benefit calculated over all of the iterations of each person in the TRIM3 extract.

data new (keep=OldIdentifier CPSPersonID sumwgt avgben);
set one;
by OldIdentifier CPSPersonID;

/* Sum the weights and the weighted SNAP benefits over all iterations of the person. */
retain sumwgt sumben;

if first.CPSPersonID then do;
sumwgt=0;
sumben=0;
end;
sumwgt=sumwgt+PersonWeight;
sumben=sumben+(AnnualBenefitsReceived*PersonWeight);

/* If this is the last record for this person, calculate the weighted average benefit and output the record. */
if last.CPSPersonID then do;
avgben=sumben/sumwgt;
output;
end;
run;

4) Perform the following checks.

The "new" dataset created above should have the same number of records as your person-level non-replicated CPS ASEC dataset.
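
One way to compare the two record counts is sketched below. It assumes that the non-replicated person-level CPS ASEC file has already been read into a SAS dataset, given here the hypothetical name asec; substitute your own dataset name.

```sas
/* Count the records in the collapsed TRIM3 dataset and in the   */
/* non-replicated CPS ASEC person file. "asec" is a hypothetical */
/* dataset name, not part of the TRIM3 extract.                  */
proc sql;
  select count(*) as n_new  from new;
  select count(*) as n_asec from asec;
quit;
```

The two counts should match. If they do not, check that the extract covers the same persons as the CPS ASEC file before going further.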

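The weighted sum of SNAP benefits in the "one" dataset and the weighted sum of average SNAP benefits in the "new" dataset should be identical (apart from rounding), because for each person sumwgt multiplied by avgben equals sumben, the sum of the weighted benefits over that person's iterations. The two totals may be compared using the following proc means steps.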

proc means data=one sum;
var AnnualBenefitsReceived;
weight PersonWeight;
run;

proc means data=new sum;
var avgben;
weight sumwgt;
run;

5) Once these steps have been successfully completed, you may merge the "new" dataset with non-replicated person-level CPS ASEC data. Merge by OldIdentifier (H_SEQ) and CPSPersonID (PPPOS). A good check would be to ascertain that sumwgt is equal (or very nearly equal) to person weight in the non-replicated data. You can check that by generating a mean of the absolute difference between the two weights.
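
The merge and the weight check described in this step might look something like the following sketch. It assumes the non-replicated person-level CPS ASEC data are in a dataset with the hypothetical name asec, that this dataset carries H_SEQ, PPPOS, and MARSUPWT under those names, and that MARSUPWT is on the same scale as PersonWeight. The names asec_sorted, merged, in_trim, in_asec, and wgtdiff are illustrative only.

```sas
/* Sort the CPS ASEC person file by the merge keys, renaming them */
/* to match the TRIM3 variable names.                             */
proc sort data=asec (rename=(H_SEQ=OldIdentifier PPPOS=CPSPersonID))
          out=asec_sorted;
  by OldIdentifier CPSPersonID;
run;

/* "new" is already in key order because it was built with BY     */
/* processing from the sorted "one" dataset.                      */
data merged;
  merge new (in=in_trim) asec_sorted (in=in_asec);
  by OldIdentifier CPSPersonID;
  /* Keep only persons present in both files. */
  if in_trim and in_asec;
  /* Absolute difference between the summed TRIM3 weight and the  */
  /* original CPS ASEC person weight.                             */
  wgtdiff = abs(sumwgt - MARSUPWT);
run;

/* The mean (and maximum) of wgtdiff should be zero or very close. */
proc means data=merged n mean max;
  var wgtdiff;
run;
```

If the number of observations in merged is smaller than the number in "new", some persons did not match and the earlier steps should be reviewed.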

Since avgben is a person-level variable, you will have to sum avgben for all persons in a family to compute the amount of SNAP received by a family.
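
A family-level total could be produced along the following lines. This is only a sketch: it assumes the merged person-level dataset is named merged and that it contains a family identifier, given here the hypothetical name FamilyID. Use whichever family identifier your CPS ASEC file provides.

```sas
/* Sum the person-level average benefit over all persons in each   */
/* family. "merged" and "FamilyID" are hypothetical names.         */
proc summary data=merged nway;
  class OldIdentifier FamilyID;
  var avgben;
  output out=family_snap (drop=_type_ _freq_) sum=famben;
run;
```

The output dataset family_snap has one record per family within each household, with famben holding the summed benefit.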